9 research outputs found

    Anthropometry: An R Package for Analysis of Anthropometric Data

    The development of powerful new 3D scanning techniques has enabled the generation of large, up-to-date anthropometric databases, which provide highly valued data for improving the ergonomic design of products adapted to the user population. As a consequence, Ergonomics and Anthropometry are two increasingly quantitative fields, so advanced statistical methodologies and modern software tools are required to get the maximum benefit from anthropometric data. This paper presents a new R package, called Anthropometry, which is available on the Comprehensive R Archive Network. It brings together statistical methodologies concerning clustering, statistical shape analysis, statistical archetypal analysis and the statistical concept of data depth, which have been developed specifically to deal with anthropometric data. They are proposed with the aim of providing effective solutions to some common anthropometric problems, such as clothing design or workstation design (focusing on the particular case of aircraft cockpits). The utility of the package is shown by analyzing the anthropometric data obtained from a survey of the Spanish female population performed in 2006 and from the 1967 United States Air Force survey. This manuscript is also contained in Anthropometry as a vignette.

    Archetypoid analysis for sports analytics

    We aim to make sense of the growing amount of sports performance data by finding extreme data points, which makes human interpretation easier. In archetypoid analysis each datum is expressed as a mixture of actual observations (archetypoids). Therefore, it allows us to identify not only extreme athletes and teams, but also the composition of other athletes (or teams) according to the archetypoid athletes, and to establish a ranking. The utility of archetypoids in sports is illustrated with basketball and soccer data in three scenarios. Firstly, with multivariate data, where they are compared with alternative methods and give the best results. Secondly, although functional data (time series or trajectories) are common in sports, functional data analysis has not been exploited until now, due to the sparseness of the functions. In the second scenario, we extend archetypoid analysis to sparse functional data, thereby showing the potential of functional data analysis in sports analytics. Finally, in the third scenario, features are not available, so we use proximities. We extend archetypoid analysis to the case where asymmetric relations are present in the data. This study yields valuable knowledge about player, team and league performance, so that we can analyze athletes' careers. This work has been partially supported by Grant DPI2013-47279-C2-1-R. The databases and R code (including the web application) to reproduce the results can be freely accessed at www.uv.es/vivigui/software

    Development of statistical methodologies applied to anthropometric data oriented towards the ergonomic design of products

    Ergonomics is the scientific discipline that studies the interactions between human beings and the elements of a system, and it has multiple applications in areas such as clothing and footwear design or both working and household environments. In each of these sectors, knowing the anthropometric dimensions of the current target population is fundamental to ensure that products suit as well as possible most of the users who make up the population. Anthropometry refers to the study of the measurements and dimensions of the human body and it is considered a very important branch of Ergonomics because of its considerable influence on the ergonomic design of products. Human body measurements have usually been taken using rulers, calipers or measuring tapes. These procedures are simple and cheap to carry out. However, they have one major drawback: the body measurements obtained, and consequently the human shape information, are imprecise and inaccurate. Furthermore, they always require interaction with real subjects, which increases the time needed for measurement and data collection. The development of new three-dimensional (3D) scanning techniques has represented a huge step forward in the way anthropometric data are obtained. This technology allows 3D images of the human shape to be captured and, at the same time, generates highly detailed and reproducible anthropometric measurements. The great potential of these new scanning systems for the digitization of the human body has contributed to promoting new anthropometric studies in several countries, such as the United Kingdom, Australia, Germany, France or the USA, in order to acquire accurate anthropometric data on their current populations. In this context, in 2006 the Spanish Ministry of Health commissioned a 3D anthropometric survey of the Spanish female population, following the agreement signed by the Ministry itself with the Spanish associations and companies of the manufacturing, distribution, fashion design and knitwear sectors.
A sample of 10,415 Spanish females from 12 to 70 years old, randomly selected from the official Postcode Address File, was measured. The two main objectives of this study, which was conducted by the Biomechanics Institute of Valencia, were the following: on the one hand, to characterize the shape and body dimensions of the current Spanish female population in order to develop a standard sizing system that could be used by all clothing designers; on the other hand, to promote a healthy image of beauty through the representation of suitable mannequins. In order to tackle both objectives, Statistics plays an essential role. Thus, the statistical methodologies presented in this PhD work have been applied to the database obtained from the Spanish anthropometric study. Clothing sizing systems classify the population into homogeneous groups (size groups) based on some key anthropometric dimensions. All members of the same group are similar in body shape and size, so they can wear the same garment. In addition, members of different groups are very different with respect to their body dimensions. An efficient and optimal sizing system aims at accommodating as large a percentage of the population as possible, in the optimum number of size groups that best describes the shape variability of the population. In addition, the garment fit for the accommodated individuals must be as good as possible. A very valuable reference on sizing systems is the book Sizing in clothing: Developing effective sizing systems for ready-to-wear clothing, by Susan Ashdown. Each clothing size is defined from a person whose body measurements are located toward the central value for each of the dimensions considered in the analysis. The central person, who is considered the size representative (the size prototype), becomes the basic pattern from which the clothing line in the same size is designed.
Clustering is the statistical tool that divides a set of individuals into groups (clusters), in such a way that subjects of the same cluster are more similar to each other than to those in other groups. In addition, clustering defines each group by means of a representative individual. Therefore, the idea of using clustering to define an efficient sizing system arises naturally. Specifically, four of the methodologies presented in this PhD thesis, which aim to segment the population into optimal sizes, use different clustering methods. The first one, called trimowa, has been published in Expert Systems with Applications. It is based on a specially defined distance for examining differences between women regarding their body measurements. The second and third ones (called biclustAnthropom and TDDclust, respectively) will soon be submitted together in a single paper. BiclustAnthropom adapts to the field of Anthropometry a clustering method designed for the specific case of gene expression data. TDDclust, in turn, uses the concept of statistical depth for grouping according to the most central (deepest) observation in each size. As mentioned, current sizing systems are based on an appropriate set of anthropometric dimensions, so clustering is carried out in Euclidean space. The three previous proposals all work in this way. Instead, in the fourth and last approach, called kmeansProcrustes, a clustering procedure is proposed that groups women according to their body shape, which is represented by a set of anatomical markers (landmarks). For this purpose, statistical shape analysis is fundamental. This contribution has been submitted for publication. A sizing system is intended to cover the so-called standard population, discarding the individuals with extreme sizes (both large and small). In mathematical language, these individuals can be considered outliers.
An outlier is an observation point that is distant from other observations. In our case, a person with extreme anthropometric measurements would be considered a statistical outlier. Clothing companies usually design garments for the standard sizes so that their market share is optimal. Nevertheless, with their foreign expansion, many brands are broadening their collections and already have a special sizes section. In recent years, Internet shopping has been an alternative for consumers with extreme sizes looking for clothes that follow trends. Custom-made manufacturing is another possibility, with the advantage that garments are made according to the customer's preferences. The four aforementioned methodologies (trimowa, biclustAnthropom, TDDclust and kmeansProcrustes) have been adapted to accommodate only the standard population. Once a particular garment has been designed, the assessment and analysis of fit is performed using one or more fit models. The fit model represents the body dimensions selected by each company to define the proportional relationships needed to achieve the fit the company has determined. The definition of an efficient sizing system relies heavily on the accuracy and representativeness of the fit models with respect to the population to which it is addressed. In this PhD work, a statistical approach is proposed to identify representative fit models. It is based on another clustering method originally developed for grouping gene expression data. This method, called hipamAnthropom, has been published in Decision Support Systems. From well-defined fit models and prototypes, representative and accurate mannequins of the population can be made.
Unlike clothing design, where representative cases correspond to central individuals, in the design of working and household environments the variability of human shape is described by extreme individuals, which are those that have the largest or smallest values (or extreme combinations) in the dimensions involved in the study. This is often referred to as the accommodation problem. A very interesting reference in this area is the book entitled Guidelines for Using Anthropometric Data in Product Design, published by The Human Factors and Ergonomics Society. The idea behind this way of proceeding is that if a product fits extreme observations, it will also fit the other (less extreme) ones. To that end, in this PhD thesis we propose two methodological contributions based on statistical archetypal analysis. An archetype in Statistics is an extreme individual that is obtained as a convex combination of other subjects of the sample. The first of these methodologies has been published in Computers and Industrial Engineering, whereas the second one has been submitted for publication. The outline of this PhD report is as follows: Chapter 1 reviews the state of the art of Ergonomics and Anthropometry and introduces the anthropometric survey of the Spanish female population. Chapter 2 presents the trimowa, biclustAnthropom and hipamAnthropom methodologies. In Chapter 3 the kmeansProcrustes proposal is detailed. The TDDclust methodology is explained in Chapter 4. Chapter 5 presents the two methodologies related to archetypal analysis. Since all these contributions have been programmed in the statistical software R, Chapter 6 presents the Anthropometry R package, which brings together all the algorithms associated with each approach. In this way, Chapters 2 to 6 present all the methodologies and results included in this PhD thesis. Finally, Chapter 7 provides the most important conclusions.

    Archetypal analysis: contributions for estimating boundary cases in multivariate accommodation problem

    The use of archetypal analysis is proposed in order to determine a set of representative cases that cover a certain percentage of the population in the accommodation problem. A well-known anthropometric database has been used to compare our methodology with the commonly used PCA approach, showing the advantages of our methodology: the target level of accommodation is reached (unlike with the PCA approach), no further adjustments are necessary, and the user can either decide the number of archetypes to consider or leave the selection to a criterion. Unlike PCA, the objective of archetypal analysis is to obtain extreme individuals, so it is the appropriate statistical technique for solving this type of problem. Archetypes cannot be obtained with PCA even if we consider all the components, as we show in the application.

    Archetypoids: A new approach to define representative archetypal data

    The new concept of archetypoids is introduced. Archetypoid analysis represents each observation in a dataset as a mixture of actual observations in the dataset, which are pure types, or archetypoids. Unlike in archetype analysis, archetypoids are real observations, not mixtures of observations. This is relevant when existing archetypal observations are needed, rather than fictitious ones. An algorithm is proposed to find them and some of their theoretical properties are introduced. It is also shown how they can be obtained when only dissimilarities between observations are known (features are unavailable). Archetypoid analysis is illustrated in two design problems and several examples, comparing archetypoids with the archetypes, the nearest observations to them and other unsupervised methods. The authors would like to thank Juan Domingo from the University of Valencia for providing the binary images of women's trunks. They would also like to thank the Biomechanics Institute of Valencia for providing them with the dataset and the Spanish Ministry of Health and Consumer Affairs for having promoted and coordinated the "Anthropometric Study of the Female Population in Spain". The authors are also grateful to the Associate Editor and two reviewers for their very constructive suggestions, which have led to improvements in the manuscript. This work has been partially supported by Grant DPI2013-47279-C2-1-R. Vinue, G.; Epifanio, I.; Alemany Mut, MS. (2015). Archetypoids: A new approach to define representative archetypal data. Computational Statistics and Data Analysis. 87:102-115. https://doi.org/10.1016/j.csda.2015.01.018
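
The distinction between archetypes and archetypoids drawn in the abstract above can be written down compactly. The block below is a sketch of the usual least-squares formulation of archetypal analysis; the notation ($x_i$ for the $n$ observations, $z_j$ for the $k$ archetypes, $\alpha_{ij}$ and $\beta_{jl}$ for the mixture weights) is an assumption introduced here for illustration and does not appear in the listing itself.

```latex
\begin{aligned}
\min_{\alpha,\,\beta}\quad
  & \sum_{i=1}^{n} \Bigl\| x_i - \sum_{j=1}^{k} \alpha_{ij}\, z_j \Bigr\|^2,
  \qquad z_j = \sum_{l=1}^{n} \beta_{jl}\, x_l, \\
\text{subject to}\quad
  & \alpha_{ij} \ge 0, \quad \sum_{j=1}^{k} \alpha_{ij} = 1
    \quad (i = 1, \dots, n), \\
\text{archetypes:}\quad
  & \beta_{jl} \ge 0, \quad \sum_{l=1}^{n} \beta_{jl} = 1
    \quad (j = 1, \dots, k), \\
\text{archetypoids:}\quad
  & \beta_{jl} \in \{0, 1\}, \quad \sum_{l=1}^{n} \beta_{jl} = 1
    \quad (j = 1, \dots, k).
\end{aligned}
```

Under the binary constraint each $z_j$ coincides with exactly one observed $x_l$, which is why archetypoids are real individuals, whereas archetypes are in general convex mixtures that need not correspond to any observed case.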

    Twitter as a Tool for Teaching and Communicating Microbiology: The #microMOOCSEM Initiative

    Online social networks are increasingly used by the population on a daily basis. They are considered a powerful tool for science communication and their potential as educational tools is emerging. However, their usefulness in academic practice is still a matter of debate. Here, we present the results of our pioneering experience teaching a full Basic Microbiology course via Twitter (#microMOOCSEM), consisting of 28 lessons of 40-45 minutes each, at a rate of one tweet per minute, over 10 weeks. Lessons were prepared by 30 different lecturers, covering most basic areas in Microbiology and some monographic topics of general interest (malaria, HIV, tuberculosis, etc.). Analysis of the impact and acceptance of the course gave largely positive results, with a 330% increase in followers and a >350-fold increase in the number of visits per month to the Twitter account of the host institution, the Spanish Society for Microbiology. Almost one third of the course followers were located overseas. Our study indicates that Massive Open Online Courses (MOOCs) via Twitter are highly dynamic, interactive, and accessible to large audiences, providing a valuable tool for social learning and communicating science. This strategy attracts the interest of students towards particular topics in the field, efficiently complementing customary academic activities, especially in multidisciplinary areas like Microbiology.

    The k-means algorithm for 3D shapes with an application to apparel design

    Clustering of objects according to shapes is of key importance in many scientific fields. In this paper we focus on the case where the shape of an object is represented by a configuration matrix of landmarks. It is well known that this shape space has a finite-dimensional Riemannian manifold structure (non-Euclidean), which makes it difficult to work with. Papers about clustering on this space are scarce in the literature. The basic foundation of the k-means algorithm is the fact that the sample mean is the value that minimizes the sum of squared Euclidean distances from each point to the centroid of the cluster to which it belongs, so our idea is to integrate the Procrustes-type distances and Procrustes mean into the k-means algorithm to adapt it to the shape analysis context. As far as we know, there have been just two attempts in that direction. In this paper we propose to adapt the classical k-means Lloyd algorithm to the context of Shape Analysis, focusing on the three-dimensional case. We present a study comparing its performance with the Hartigan-Wong k-means algorithm, one that was previously adapted to the field of Statistical Shape Analysis. We demonstrate the better performance of the Lloyd version and, finally, we propose to add a trimmed procedure. We apply both to a 3D database obtained from an anthropometric survey of the Spanish female population conducted in this country in 2006. The algorithms presented in this paper are available in the Anthropometry R package, whose most current version is always available from the Comprehensive R Archive Network. Vinue, G.; Simo, A.; Alemany Mut, MS. (2016). The k-means algorithm for 3D shapes with an application to apparel design. Advances in Data Analysis and Classification. 10(1):103-132. doi:10.1007/s11634-014-0187-1
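
The Lloyd-type iteration with a trimming step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the code of the Anthropometry R package: the names `trimmed_lloyd`, `euclidean` and `coordinate_mean`, the `alpha` trimming fraction and the exact trimming rule (discard the observations farthest from their nearest center at each iteration) are assumptions made for illustration. The distance and the mean are passed in as functions, so a Procrustes distance and a Procrustes mean could in principle be substituted; the Euclidean defaults below are only stand-ins.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coordinate_mean(points):
    """Coordinate-wise sample mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def trimmed_lloyd(data, init_centers, alpha=0.0,
                  dist=euclidean, mean=coordinate_mean, max_iter=100):
    """Lloyd-type k-means with an optional trimming step (illustrative sketch).

    data         -- list of observations
    init_centers -- list of k starting centers
    alpha        -- fraction of observations discarded at each iteration
    dist, mean   -- pluggable distance and mean (Euclidean stand-ins here)

    Returns (centers, labels); labels[i] is None for trimmed observations.
    """
    centers = list(init_centers)
    k = len(centers)
    n_trim = int(alpha * len(data))   # number of observations to discard
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every observation.
        nearest = [min(range(k), key=lambda j: dist(x, centers[j]))
                   for x in data]
        # Trimming step: drop the n_trim observations farthest from their center.
        order = sorted(range(len(data)),
                       key=lambda i: dist(data[i], centers[nearest[i]]))
        kept = set(order[:len(data) - n_trim])
        new_labels = [nearest[i] if i in kept else None
                      for i in range(len(data))]
        if new_labels == labels:      # assignments are stable: stop
            break
        labels = new_labels
        # Update step: recompute each center from its non-trimmed members.
        for j in range(k):
            members = [data[i] for i in range(len(data)) if labels[i] == j]
            if members:               # an emptied cluster keeps its old center
                centers[j] = mean(members)
    return centers, labels
```

For example, calling `trimmed_lloyd(points, [points[0], points[3]], alpha=0.15)` on two tight groups of planar points plus one distant point leaves the distant point unassigned (label `None`), so it does not pull either center toward it. Like any Lloyd-type procedure, the result depends on the starting centers.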